1,573 research outputs found

    ASR decoding in a computational model of human word recognition

    This paper investigates the interaction between acoustic scores and symbolic mismatch penalties in multi-pass speech decoding techniques that are based on the creation of a segment graph followed by a lexical search. The interaction between acoustic and symbolic mismatches determines to a large extent the structure of the search space of these multi-pass approaches. The background of this study is a recently developed computational model of human word recognition, called SpeM. SpeM is able to simulate human word recognition data and is built as a multi-pass speech decoder. Here, we focus on unravelling the structure of the search space that is used in SpeM and similar decoding strategies. Finally, we elaborate on the close relation between distances in this search space and distance measures in search spaces that are based on a combination of acoustic and phonetic features

    Looking at a digital research data archive - Visual interfaces to EASY

    In this paper we explore visually the structure of the collection of a digital research data archive in terms of metadata for deposited datasets. We look into the distribution of datasets over different scientific fields; the role of main depositors (persons and institutions) in different fields, and main access choices for the deposited datasets. We argue that visual analytics of metadata of collections can be used in multiple ways: to inform the archive about structure and growth of its collection; to foster collections strategies; and to check metadata consistency. We combine visual analytics and visual enhanced browsing introducing a set of web-based, interactive visual interfaces to the archive's collection. We discuss how text based search combined with visual enhanced browsing enhances data access, navigation, and reuse. Comment: Submitted to the TPDL 201

    Rule extraction for allophone synthesis: final report ALLODIF

    Information encoding by deep neural networks: what can we learn?

    The recent advent of deep learning techniques in speech technology and in particular in automatic speech recognition has yielded substantial performance improvements. This suggests that deep neural networks (DNNs) are able to capture structure in speech data that older methods for acoustic modeling, such as Gaussian Mixture Models and shallow neural networks, fail to uncover. In image recognition it is possible to link representations on the first couple of layers in DNNs to structural properties of images, and to representations on early layers in the visual cortex. This raises the question whether it is possible to accomplish a similar feat with representations on DNN layers when processing speech input. In this paper we present three different experiments in which we attempt to untangle how DNNs encode speech signals, and to relate these representations to phonetic knowledge, with the aim to advance conventional phonetic concepts and to choose the topology of a DNN more efficiently. Two experiments investigate representations formed by auto-encoders. A third experiment investigates representations on convolutional layers that treat speech spectrograms as if they were images. The results lay the basis for future experiments with recursive networks

    Reproducibility of electrical caries measurements: A technical problem?

    The currently available instrument for electrical detection of occlusal caries lesions [Electronic Caries Monitor (ECM)] uses a site-specific measurement with co-axial air drying. The reproducibility of this method has been reported to be fair to good. It was noticed that the measurement variation of this technique appeared to be non-random. It was the aim of this study to analyse how such a non-random reproducibility pattern arises and whether it could be observed for other operators and ECM models. Analysis of hypothetical measurement pairs showed that the pattern was related to measurements at the high and low end of the measurement range for the instrument. Data sets supplied by other researchers to a varying degree showed signs of a similar non-random pattern. These data sets were acquired at different locations, by different operators and using 3 different ECM models. The frequency distribution of measurements in all cases showed a single or double end-peaked distribution shape. It was concluded that the pattern was a general feature of the measurement method. It was tentatively attributed to several characteristics such as a high value censoring, insufficient probe contact and unpredictable probe contact. A different measurement technique, with an improved probe contact, appears to be advisable. Copyright (C) 2005 S. Karger AG, Basel

    Schwa reduction in low-proficiency L2 speakers: Learning and generalization

    This paper investigated the learnability and generalizability of French schwa alternation by Dutch low-proficiency second language learners. We trained 40 participants on 24 new schwa words by exposing them equally often to the reduced and full forms of these words. We then assessed participants' accuracy and reaction times to these newly learnt words as well as 24 previously encountered schwa words with an auditory lexical decision task. Our results show learning of the new words in both forms. This suggests that lack of exposure is probably the main cause of learners' difficulties with reduced forms. Nevertheless, the full forms were slightly better recognized than the reduced ones, possibly due to phonetic and phonological properties of the reduced forms. We also observed no generalization to previously encountered words, suggesting that our participants stored both of the learnt word forms and did not create a rule that applies to all schwa words

    High-density SNP association study of the 17q21 chromosomal region linked to autism identifies CACNA1G as a novel candidate gene.

    Chromosome 17q11-q21 is a region of the genome likely to harbor susceptibility to autism (MIM(209850)) based on earlier evidence of linkage to the disorder. This linkage is specific to multiplex pedigrees containing only male probands (MO) within the Autism Genetic Resource Exchange (AGRE). Earlier, Stone et al.(1) completed a high-density single nucleotide polymorphism association study of 13.7 Mb within this interval, but common variant association was not sufficient to account for the linkage signal. Here, we extend this single nucleotide polymorphism-based association study to complete the coverage of the two-LOD support interval around the chromosome 17q linkage peak by testing the majority of common alleles in 284 MO trios. Markers within an interval containing the gene, CACNA1G, were found to be associated with Autism Spectrum Disorder at a locally significant level (P = 1.9 × 10^-5). While establishing CACNA1G as a novel candidate gene for autism, these alleles do not contribute a sufficient genetic effect to explain the observed linkage, indicating that there is substantial genetic heterogeneity despite the clear linkage signal. The region thus likely harbors a combination of multiple common and rare alleles contributing to the genetic risk. These data, along with earlier studies of chromosomes 5 and 7q3, suggest few if any major common risk alleles account for Autism Spectrum Disorder risk under major linkage peaks in the AGRE sample. This provides important evidence for strategies to identify Autism Spectrum Disorder genes, suggesting that they should focus on identifying rare variants and common variants of small effect

    Speech register influences listeners' word expectations

    We utilized the N400 effect to investigate the influence of speech register on predictive language processing. Participants listened to long stretches (4–15 min) of naturalistic speech from different registers (dialogues, news broadcasts, and read-aloud books), totalling approximately 50,000 words, while the EEG signal was recorded. We estimated the surprisal of words in the speech materials with the aid of a statistical language model in such a manner that it reflected different predictive processing strategies: generic, register-specific, or recency-based. The N400 amplitude was best predicted with register-specific word surprisal, indicating that the statistics of the wider context (i.e., register) influence predictive language processing. Furthermore, adaptation to speech register cannot merely be explained by recency effects; instead, listeners adapt their word anticipations to the presented speech register

    Dealing with uncertain input in word learning

    In this paper we investigate a computational model of word learning that is embedded in a cognitively and ecologically plausible framework. Multi-modal stimuli from four different speakers form a varied source of experience. The model incorporates active learning, attention to a communicative setting and clarity of the visual scene. The model's ability to learn associations between speech utterances and visual concepts is evaluated during training to investigate the influence of active learning under conditions of uncertain input. The results show the importance of shared attention in word learning and the model's robustness against noise